GTM-UVigo System for Multimodal Person Discovery in Broadcast TV Task at MediaEval 2016

Authors

  • Paula Lopez-Otero
  • Laura Docío Fernández
  • Carmen García-Mateo
Abstract

In this paper, we present the system developed by the GTM-UVigo team for the Multimodal Person Discovery in Broadcast TV task at MediaEval 2016. The proposed approach is a novel strategy for person discovery that, unlike previous works, is not based on speaker and face diarisation. Instead, the task is approached as a person recognition problem: there is an enrolment stage, where the voice and face of each discovered person are detected, and then, for each shot, the most suitable voice and face are assigned using the i-vector paradigm. These two biometric modalities are combined by decision fusion.
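
The recognition-and-fusion idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring function or the fusion rule, so everything below is an assumption made for illustration: cosine similarity stands in for i-vector scoring, the fusion rule (accept a name only when both modalities agree and both scores reach a threshold) is hypothetical, and the names `best_match`, `fuse_decisions`, and the toy vectors are invented here rather than taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_match(shot_vec, enrolled):
    """Return (name, score) for the enrolled vector most similar to shot_vec."""
    scored = ((name, cosine(shot_vec, vec)) for name, vec in enrolled.items())
    return max(scored, key=lambda pair: pair[1])


def fuse_decisions(voice_decision, face_decision, threshold=0.5):
    """Hypothetical decision-fusion rule (an assumption, not the paper's rule).

    Accept a name only if voice and face agree on it and both scores
    reach the threshold; otherwise return None (no person assigned).
    """
    v_name, v_score = voice_decision
    f_name, f_score = face_decision
    if v_name == f_name and min(v_score, f_score) >= threshold:
        return v_name
    return None


# Toy enrolment data: one low-dimensional stand-in "i-vector" per person and modality.
enrolled_voices = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
enrolled_faces = {"alice": [0.0, 1.0], "bob": [1.0, 0.0]}

# Stand-in vectors extracted from a single shot.
shot_voice = [0.9, 0.1, 0.0]
shot_face = [0.2, 0.8]

voice_decision = best_match(shot_voice, enrolled_voices)
face_decision = best_match(shot_face, enrolled_faces)
shot_label = fuse_decisions(voice_decision, face_decision)
```

In this toy setup each modality makes its own decision per shot, and the fusion step only combines those decisions, which is what distinguishes decision fusion from combining features or scores earlier in the pipeline.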

Similar resources

GTM-UVigo Systems for Person Discovery Task at MediaEval 2015

In this paper, we present the systems developed by the GTM-UVigo team for the Multimodal Person Discovery in Broadcast TV task at MediaEval 2015. The systems propose two different strategies for person discovery in audio through speaker diarization (one based on an online clustering strategy with error correction using OCR information and the other based on agglomerative hierarchical clustering) as ...

Tokyo Tech at MediaEval 2016 Multimodal Person Discovery in Broadcast TV task

This paper describes our diarization system for the Multimodal Person Discovery in Broadcast TV task of the MediaEval 2016 Benchmark evaluation campaign [1]. The goal of this task is to name speakers who are appearing and speaking simultaneously in the video, without prior knowledge. Our diarization system relies on a face diarization approach. We extract deep features from a face every 0.5 secon...

Multimodal Person Discovery in Broadcast TV at MediaEval 2016

We describe the “Multimodal Person Discovery in Broadcast TV” task of the MediaEval 2016 benchmarking initiative. Participants are asked to return the names of people who can be both seen and heard in every shot of a collection of videos. The list of people is not known a priori and their names have to be discovered in an unsupervised way from media content using text overlay or speech transcr...

PERCOLATTE: A Multimodal Person Discovery System in TV Broadcast for the Medieval 2015 Evaluation Campaign

This paper describes the PERCOLATTE participation in the MediaEval 2015 task “Multimodal Person Discovery in Broadcast TV”, which requires developing algorithms for unsupervised talking face identification in broadcast news. The proposed approach relies on two identity propagation strategies, both based on document chaptering and restricted overlaid names propagation rules. The primary submission sh...

Combining Audio Features and Visual I-Vector @ MediaEval 2015 Multimodal Person Discovery in Broadcast TV

This paper describes our diarization system for the Multimodal Person Discovery in Broadcast TV task of the MediaEval 2015 Benchmark evaluation campaign [1]. The goal of this task is to name speakers who are appearing and speaking simultaneously in the video, without prior knowledge. Our diarization system is based on a multimodal approach that combines audio and visual information. We extract feat...

Publication date: 2016